Characteristics of Subtype and Molecular Transmission Networks among Newly Diagnosed HIV-1 Infections in Patients Residing in Taiyuan City, Shanxi Province, China, from 2021 to 2023

The HIV-1 pandemic, spanning four decades, presents a significant challenge to global public health. This study aimed to understand the molecular transmission characteristics of newly reported HIV infections in Taiyuan, Shanxi Province, China, and to analyze the characteristics of subtypes and the risk factors of the transmission network, providing a scientific basis for precise prevention and intervention measures. A total of 720 samples were collected from newly diagnosed HIV-1 patients residing in Taiyuan between 2021 and 2023. Partial HIV-1 pol gene sequences were obtained and used to analyze subtypes and molecular transmission networks. Out of the samples, 584 pol sequences were obtained, revealing 17 HIV-1 subtypes, with CRF07_BC (48.29%), CRF01_AE (31.34%), and CRF79_0107 (7.19%) being the dominant subtypes. Using a genetic distance threshold of 1.5%, 49 molecular transmission clusters were generated from 313 pol gene sequences. Univariate analysis showed significant differences in inclusion in the HIV molecular transmission network by HIV subtype and household registration (p < 0.05). Multivariate logistic regression analysis showed that the CRF79_0107 subtype and migrant status were associated with higher proportions of sequences in the HIV transmission network. These findings provide a scientific foundation for the development of localized HIV-specific intervention strategies.


Introduction
Human immunodeficiency virus (HIV) has been a serious threat to global public health and disease surveillance systems for the past 40 years [1][2][3][4]. HIV can be categorized into two primary types: HIV-1, which is the primary causative agent of acquired immunodeficiency syndrome (AIDS), and HIV-2 [5]. HIV-1 is widespread globally and is the primary type of infection in China. In contrast, HIV-2 is primarily prevalent in countries in West Africa [6]. Among the four phylogenetic groups that constitute HIV-1 (M, N, O, P), the M group is considered the main driver of the AIDS pandemic [7]. The M group includes nine major subtypes (i.e., A-D, F-H, J, and K), over 110 circulating recombinant forms (CRFs), and numerous unique recombinant forms (URFs) [8]. Due to genetic variation and recombination between different genotypes, new CRFs and URFs are continually emerging, significantly altering the global epidemic landscape of HIV. According to the World Health Organization (https://www.who.int/ (accessed on 6 April 2024)), the estimated number of persons living with HIV/AIDS (PLWH) globally by the end of 2022 was 39 million, with 1.3 million new infections and approximately 630,000 people dying annually from AIDS-related illnesses. Despite the implementation of the '90-90-90' prevention strategy proposed by the Joint United Nations Program on HIV/AIDS (UNAIDS) [9,10], China continues to face a high prevalence of HIV-1 subtypes, presenting a complex and diverse challenge with significant obstacles and pressures [11].
Since the discovery of the first HIV-1 infection in China in 1985 [12], the number of PLWH has been increasing year after year. By the end of 2023, the number of PLWH in China was 1,289,700, with 110,491 newly diagnosed in 2023 [13]. The distributional characteristics and epidemiological trends of HIV-1 subtypes are changing significantly alongside the rapid rise in the number of PLWH [11]. Four national HIV molecular epidemiological surveys have been conducted; the latest, in 2016, showed a rapid increase in the number of HIV-1 subtypes [14]. More than 20 HIV-1 subtypes have been identified, with CRF07_BC, CRF01_AE, CRF08_BC, and B being the most prevalent epidemic subtypes [15]. Since the 1990s, CRF07_BC and CRF01_AE have continuously evolved in China, leading to a widespread epidemic among various risk groups, such as men who have sex with men (MSM), heterosexual individuals involved in commercial sex, and people who inject drugs (PWID) [16][17][18][19][20][21]. New CRFs and URFs are frequently reported within specific groups [22][23][24][25].
Molecular networks use the genetic similarity between viral sequences to infer how the virus spreads. Previously, researchers relied primarily on questionnaires and partner tracing to construct social or sexual transmission networks. Subsequently, they used genetic information from HIV sequences to develop molecular transmission networks [26][27][28]. Zhang et al. employed molecular networks to quantify HIV transmission patterns in Hangzhou, thereby offering local evidence for the development of precise HIV prevention strategies [29]. He et al. used molecular networks to analyze HIV-1 transmission clusters in Guangxi, identifying distinct transmission networks and clusters [30]. Using molecular networks to construct a large-scale transmission network for PLWH is more precise and has been extensively used to validate the findings and conclusions of epidemiological field investigations. This approach can effectively mitigate information disparities and enhance the credibility of conclusions [31][32][33].
Taiyuan City, comprising 10 counties (districts/cities), is the capital of Shanxi Province, China. Positioned strategically in the heart of Shanxi Province, Taiyuan City serves as a significant transportation hub in northern China, attracting a large population flow. To gain a comprehensive understanding of the distribution and characteristics of HIV-1 subtypes in Taiyuan, we utilized phylogenetic inference, transmission network analysis, and molecular epidemiological data to investigate newly diagnosed HIV-1 infections between 2021 and 2023.

Population Study and Sample Collection
Patients newly diagnosed with HIV-1 infection and residing in Taiyuan between 2021 and 2023 were included in the study. The samples were selected based on the following criteria: (1) serum or plasma samples collected by the Taiyuan Center for Disease Control and Prevention (Taiyuan CDC) that tested positive for HIV-1 by Western blot; (2) participants must reside or work in Taiyuan, regardless of their household registration address; and (3) the household registration address is determined based on the information from the ID card. The epidemiological data (including gender, ethnicity, age, marital status, education background, occupation, transmission route, etc.) were acquired from the China Information System for Disease Control and Prevention.

HIV-1 RNA Extraction, Amplification, and Sequencing
Viral RNA was extracted from specimens using a MagMAX™-96 Viral RNA Isolation Kit (Thermo Fisher Scientific, Foster City, CA, USA), according to the manufacturer's instructions. RNA was eluted with 50 µL of elution buffer and used immediately. Reverse transcriptase-polymerase chain reaction (RT-PCR) and nested PCR were used to amplify the partial HIV-1 polymerase (pol) gene (1.3 kb, HXB2:2253-3553). The PrimeScript™ One Step Reverse Transcriptase-Polymerase Chain Reaction Kit Version 2 (Takara, Dalian, China) and the Ex Taq Kit (Takara, Dalian, China) were employed in the amplification. The primer sequences and thermal cycling conditions were described previously [34]. The PCR products were analyzed using 1% agarose gel electrophoresis. The target PCR products were sent to Tsingke Biotechnology Co., Ltd. for purification and sequencing using an ABI 3730XL DNA sequencer (Applied Biosystems, Carlsbad, CA, USA), with five overlapping primers (https://www.chinacdc.cn/ (accessed on 15 June 2023)).

HIV-1 Genotyping, Phylogenetic Analysis, and Amino Acid Difference Analysis
Sequences were assembled using Sequencher 5.4.6 (Gene Codes, Ann Arbor, MI). BioEdit (version 7.0.9) software (http://www.mbio.ncsu.edu/BioEdit/ (accessed on 15 February 2024)) was used for sequence alignment. HIV-1 subtypes were determined from pol sequences using the Los Alamos National Laboratory HIV Database (http://www.hiv.lanl.gov (accessed on 25 February 2024)) and subsequently confirmed by phylogenetic analysis. Neighbor-joining phylogenetic trees were constructed using the Kimura two-parameter model with 1000 bootstrap replicates in MEGA software (Version 11.0.11); a bootstrap value of 70% was used as the threshold to identify the subtype and to control for potential laboratory and sample contamination [35]. We used the pairwise distance (p-distance) function in MEGA 11.0.11 to determine the average within-group and between-group distances in order to distinguish differences among all subtypes. Subtype reference alignments were downloaded from the HIV databases (www.hiv.lanl.gov (accessed on 25 February 2024)), including 41 HIV-1 group M subtypes and 45 common CRFs, at home and abroad.
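As an illustration of the p-distance comparison described above, the following is a minimal sketch rather than the MEGA implementation. The function names, the pairwise-deletion handling of gaps and ambiguous bases, and the toy sequences are all assumptions made for this example and are not study data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped (pairwise deletion); this handling is an assumption of the
    sketch, not a statement about the settings used in the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def mean_between_group(group_x, group_y):
    """Average p-distance over all pairs drawn from two groups of sequences."""
    pairs = [(x, y) for x in group_x for y in group_y]
    return sum(p_distance(x, y) for x, y in pairs) / len(pairs)


# Hypothetical aligned fragments standing in for two subtypes.
subtype_a = ["ACGTACGTAC", "ACGTACGTAT"]
subtype_b = ["ACGAACGTTC"]

within_a = p_distance(subtype_a[0], subtype_a[1])
between_ab = mean_between_group(subtype_a, subtype_b)
print(f"within A: {within_a:.3f}, between A and B: {between_ab:.3f}")
```

A between-group average that is clearly larger than the within-group averages is the pattern expected when two groups of sequences belong to different subtypes.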

Molecular Transmission Network Analysis
The pairwise genetic distances were estimated using the Tamura-Nei 93 (TN93) fast distance calculator, and the HIV molecular network was created using the HIV-TRACE tool, with a threshold genetic distance of 0.015 among HIV-1 subtypes [36]. All nodes in the HIV molecular network were annotated with epidemiological data, and molecular network maps were generated. In this study, clusters with five or more nodes were defined as larger clusters. Using the connections among various risk behavior groups within the network, we constructed Sankey diagrams to depict the links of different HIV-1 subtypes among the various risk groups in the network.
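The clustering step can be sketched as follows. This is not HIV-TRACE itself: it assumes the pairwise TN93 distances have already been computed, treats a pair as linked when its distance is at or below the 0.015 threshold (the inclusive comparison is an assumption), and defines clusters as connected components. The function name `build_clusters` and the sequence identifiers are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 0.015   # genetic distance cutoff stated in the text
LARGE_CLUSTER = 5   # clusters with five or more nodes are "larger clusters"


def build_clusters(distances, threshold=THRESHOLD):
    """Return clusters as connected components of the linked-pair graph.

    `distances` maps (id_a, id_b) to a precomputed genetic distance.
    Sequences with no link at or below the threshold are left out,
    mirroring the idea that they fall outside the network.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for (a, b), dist in distances.items():
        if dist <= threshold:
            parent[find(a)] = find(b)  # union the two components

    members = defaultdict(set)
    for node in list(parent):
        members[find(node)].add(node)
    return sorted(members.values(), key=len, reverse=True)


# Hypothetical distances; only pairs at or below 0.015 create links.
pairwise = {
    ("s1", "s2"): 0.004, ("s2", "s3"): 0.012, ("s3", "s4"): 0.014,
    ("s4", "s5"): 0.009, ("s1", "s6"): 0.031, ("s6", "s7"): 0.010,
    ("s8", "s9"): 0.020,
}
clusters = build_clusters(pairwise)
large = [c for c in clusters if len(c) >= LARGE_CLUSTER]
print([sorted(c) for c in clusters], len(large))
```

Union-find is used here only because it is a compact way to obtain connected components; any graph library would give the same grouping.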

Statistical Analysis
All data were entered into Microsoft Excel 2010 (Microsoft Corporation; Redmond, WA, USA). Statistical analysis was conducted using SPSS 26.0 (SPSS, Inc; Chicago, IL, USA). Statistical comparisons were performed using Fisher's exact test, the Chi-square test, and multivariate logistic regression to identify the factors associated with inclusion in clusters in the molecular transmission network. A p-value less than 0.05 was considered statistically significant.
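For readers who wish to reproduce a univariate comparison outside SPSS, the Pearson Chi-square test for a 2 x 2 table can be sketched as below. The counts are hypothetical and are not taken from the study, no continuity correction is applied, and the function name `chi_square_2x2` is introduced here for illustration only.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper tail equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = migrant / local, columns = clustered / not clustered.
stat, p = chi_square_2x2(60, 40, 90, 110)
print(f"chi-square = {stat:.2f}, p = {p:.4f}, significant = {p < 0.05}")
```

When any expected cell count is small, Fisher's exact test, which the text also lists, is the usual alternative.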

Ethical Statement
This study was approved by the Research Ethics Review Committee of the Taiyuan CDC (Approval ID: 2023020). All procedures were performed following the guidelines of the Declaration of Helsinki, as well as international and national laws, regulations, and guidelines for human studies.
The overall distribution characteristics of epidemic subtypes in Taiyuan are shown in Table 2. Gender and education level differed significantly among subtypes (p < 0.05). The infection primarily affected males, who accounted for 90.6% (529/584) of cases. Among individuals aged 50 and above, CRF07_BC was the main subtype, accounting for 57.1% (97/170). Similarly, CRF07_BC was highly prevalent among those with a college education or higher, accounting for 46.5% (114/245). Compared to local individuals, migrants exhibited a higher proportion of other subtypes. Heterosexual transmission remained the primary transmission route; among the 365 heterosexually infected individuals, 49.0% (179/365) carried CRF07_BC and 30.7% (112/365) carried CRF01_AE.
In addition, Figure 3 presents the trend of each subtype from 2021 to 2023, together with the trends of subtype changes among migrants and locals during the same period. Interesting trends emerge from a closer inspection of the data. In 2021, the prevalence of CRF07_BC increased among all study participants and among migrants, whereas the local proportions of CRF07_BC and CRF01_AE were stable. Subsequently, from 2022 to 2023, CRF07_BC rose in both the local and overall data. The CRF07_BC trend among migrants consistently reflected this overall trend. We calculated the average differences in amino acids and nucleotides among and within subtypes (Supplementary Table S1). The results indicate that the average nucleotide differences among HIV-1 subtypes in Taiyuan from 2021 to 2023 ranged from 0.4% to 7.4%, with the URFs showing even larger differences that far exceeded the average. Likewise, the average amino acid difference among the URFs greatly exceeded the average.

HIV-1 Molecular Transmission Network
At a genetic distance threshold of 1.5%, the transmission cluster analysis identified 313 individuals (43.5%, 313/720) within the transmission network, forming 50 HIV transmission clusters with sizes ranging from 2 to 133 nodes. A logistic regression model was used to analyze the factors associated with a node degree ≥ 2 in the network. We found that, compared to locals, migrants had significantly higher odds of clustering (adjusted odds ratio [AOR]: 2.124; 95% confidence interval [CI]: 1.457-3.095, p < 0.001). Individuals within clusters had a higher likelihood of belonging to the subtype CRF79_0107 (AOR: 4.270; 95% CI: 2.487-7.334, p < 0.001). See Table 3 for details.
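To make the odds ratio and its confidence interval concrete, the sketch below computes an unadjusted odds ratio with a Wald 95% interval from a 2 x 2 table. It is a simplification: the values reported above are adjusted estimates from a multivariate logistic regression, which this example does not reproduce. The counts and the function name `odds_ratio_wald` are hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for
        exposed group:   a clustered, b not clustered
        reference group: c clustered, d not clustered
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: migrants (60 clustered, 40 not) vs. locals (90 clustered, 110 not).
or_est, ci_low, ci_high = odds_ratio_wald(60, 40, 90, 110)
print(f"OR = {or_est:.3f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```

An interval that excludes 1, as in this toy example, corresponds to a statistically significant association at the 0.05 level.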
Among these clusters, the CRF07_BC network included a higher number of newly diagnosed HIV-1 infections than the CRF01_AE network, with the latter forming multiple smaller clusters (Figure 4). Among the nine clusters with five or more nodes detailed in Table 4, four clusters were associated with CRF07_BC, two with CRF01_AE, and one cluster each with CRF79_0107, CRF107_01B, and URF. Of the 219 individuals within the nine larger clusters, the majority (64.4%, 141/219) were infected through heterosexual transmission. The sequences of CRF07_BC formed the largest HIV-1 transmission cluster, CRF07_BC_Cluster1, comprising 58 MSM, 73 heterosexual individuals, and 2 IDUs. Linkage among MSM was the highest within this cluster. CRF07_BC_Cluster3 and CRF07_BC_Cluster8 consisted predominantly of elderly individuals, with heterosexual transmission being the primary route. Furthermore, for CRF79_0107_Cluster2, the molecular network analysis revealed a sizable cluster comprising 27 individuals, of whom 19 were locals and 8 were migrants. Moreover, the two CRF01_AE clusters, Cluster5 and Cluster6, each comprised eight individuals, all of whom were heterosexual males. In addition, we identified a CRF107_01B cluster composed mostly of local males, with 40% (2/5) and 60% (3/5) infected through heterosexual and homosexual transmission, respectively.
From the data in Figure 5, it is apparent that in the CRF07_BC network (which included 965 links), links between heterosexually infected individuals within the same risk group were the most frequent, accounting for 44.5% (429/965); among all cross-risk-group links, those between MSM and the heterosexual group were the most frequent, accounting for 18.92% (200/965). We also examined the characteristics of the HIV-1 subtypes among these different risk groups within the network. The CRF01_AE network and the network of other subtypes showed the same relationship between the various risk groups.
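The link proportions that feed a Sankey diagram of this kind can be tallied as in the sketch below, which counts each network link by the unordered pair of risk groups at its two ends. The node identifiers, group labels, and the function name `link_shares` are hypothetical, and the toy network is not study data.

```python
from collections import Counter


def link_shares(edges, risk_group):
    """Share of links for each unordered pair of risk groups.

    `edges` is an iterable of (node_u, node_v) links and `risk_group`
    maps each node to its reported transmission route.
    """
    counts = Counter()
    for u, v in edges:
        pair = tuple(sorted((risk_group[u], risk_group[v])))
        counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("the network has no links")
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical five-node network.
risk = {"n1": "heterosexual", "n2": "heterosexual",
        "n3": "MSM", "n4": "MSM", "n5": "PWID"}
edges = [("n1", "n2"), ("n1", "n3"), ("n2", "n3"), ("n3", "n4"), ("n4", "n5")]

shares = link_shares(edges, risk)
for pair, share in sorted(shares.items()):
    print(pair, f"{share:.1%}")
```

Sorting the two labels makes ("MSM", "heterosexual") and ("heterosexual", "MSM") count as the same cross-group link, which matches how within-group and cross-group shares are reported in the text.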

Discussion
In this study, we collected samples of newly diagnosed HIV-1 infections from patients residing in Taiyuan from 2021 to 2023. Subsequently, we carried out a detailed molecular epidemiological study, which included analyses of phylogenetics, transmission characteristics, and risk factors. The objective of this study was to investigate newly diagnosed HIV-1 infections and conduct a comprehensive analysis of the local transmission characteristics and epidemic patterns of HIV-1 subtypes across various risk groups.
Most HIV-1 infections in Taiyuan are diagnosed in male patients, which is consistent with the results of studies conducted in other provinces of China [37,38]. Over the years, there has been a gradual increase in female infections, with heterosexual sex being the main transmission route. Epidemiological data indicate that sexual transmission has emerged as the primary route of HIV-1 transmission, bringing the disease to the general population and presenting a significant challenge to HIV/AIDS prevention and control in China [21,39,40,41].
Our study finds that the largest share of new HIV-1 infections occurs in individuals aged 50 years or above, a proportion that appears to be higher than that noted in previous studies [42]. Newly infected individuals aged 50 and above account for 27.8% (200/720), ranking first among the age groups; this result is likely attributable to the lack of self-protection awareness among the elderly, as this age group has a low rate of condom use and limited knowledge about AIDS. This situation can lead to high-risk sexual behavior, thereby further increasing the risk of infection and transmission within the elderly population [43,44]. Targeted health education and prevention interventions tailored to the behavioral and psychological characteristics of the elderly need to be developed. On the other hand, individuals under 30 years old account for a significant proportion of infections (24.7%, 178/720), with a majority of infections occurring through homosexual transmission; this age group is sexually active and also represents a key population to target for preventing the spread of HIV.
A total of 17 HIV-1 subtypes were identified. Among the detected subtypes, CRF07_BC was predominant. This finding differed significantly from those of the 2016-2017 survey and those from other regions, where CRF01_AE (in Anhui, Liaoning, and Guangxi), B (in Henan), and CRF08_BC (in Yunnan) were the dominant subtypes [40,42,45,46,47]. CRF07_BC_N, which was previously prevalent among MSM, is now primarily transmitted through heterosexual behavior, indicating a shift in its main route of transmission [48]. Additionally, CRF07_BC_O was previously transmitted primarily among PWID and heterosexual individuals. In this study, it continued to be transmitted primarily among heterosexuals, in line with previous research [48]. These findings indicate that this subtype has broadened its transmission to include individuals at risk of sexually transmitted infections. The second most common subtype, CRF01_AE, has decreased in proportion compared with the rates noted in previous studies, with different subtypes now emerging. CRF01_AE was the predominant HIV-1 subtype in Taiyuan during 2016-2017 and was particularly prevalent among MSM, specifically the C4 and C5 subclusters [42]. Recent research indicates that CRF01_AE continues to be composed predominantly of the C4 and C5 subclusters, followed by the C1, C2, C3, and O subclusters, which are more commonly transmitted among MSM. One unanticipated result of this study is the non-detection of the subtype CRF65_cpx, which was previously found in the MSM population in Taiyuan [49]; this absence suggests that this subtype may be a vulnerable strain in Taiyuan that could be eliminated through ongoing evolution. Another possibility is that samples of this subtype were not successfully amplified.
However, the phylogenetic tree did not effectively illustrate the relationships between HIV-1 subtypes in Taiyuan and other regions due to the lack of corresponding reference sequences in the reference databases. The strains from Taiyuan were distributed across various clades in the neighbor-joining trees. Some strains formed a local cluster within the Taiyuan area, while migrants and locals were not segregated into distinct clusters but were intertwined, indicating a complex relationship of HIV-1 transmission between Taiyuan and other regions. The result highlighted that the principal driving force of the HIV/AIDS epidemic in Taiyuan was local infection. This result emphasizes the importance of localized prevention strategies to help refine or devise tailored interventions.
In this study, a transmission network based on the links of different HIV-1 clusters among various risk groups (Figures 4 and 5) showed significant differences in the transmission linkages among those groups. Molecular network analysis showed that links between MSM and heterosexually infected individuals were slightly more numerous than links among MSM alone, indicating an association between the heterosexual and MSM transmission routes. Within the CRF07_BC_Cluster1 and CRF01_AE_Cluster5 networks, linkage among MSM was the highest, while in other clusters, heterosexual individuals showed stronger linkage, indicating diverse "sources" of the different HIV-1 clusters. Moreover, our molecular network analysis revealed that diverse HIV transmission routes coexist within some transmission clusters. Interestingly, a "key person" emerged across various HIV transmission routes: individuals who have sex with men but identify as heterosexual. A possible explanation is that some MSM with newly diagnosed HIV-1 infection may not disclose their true route of transmission during epidemiological surveys due to stigma and discrimination, and that they may also engage in sexual activities with women. The unreliability of self-reported data on sexual behaviors in routine HIV/AIDS epidemiological surveys may therefore affect the formulation of effective control measures. To better understand HIV transmission routes, it is essential to employ advanced epidemiological methods, including the collection of detailed transmission information such as whether oral, vaginal, or anal sex occurred. Collaborative efforts that combine molecular network analysis with enhanced field epidemiological surveys can quantify local HIV transmission. Further research will enable us to identify and address discrepancies, resulting in a clearer understanding of the HIV/AIDS epidemic patterns in Taiyuan and providing valuable insights for prevention and treatment strategies.
Upon further analysis of the molecular network, we found that individuals in clusters were more likely to be household-registered outside of Taiyuan. Migration has been identified as a major factor driving the spread of the HIV epidemic across nations [50]. The uneven economic development in Taiyuan may prompt some of the population to seek better employment opportunities and living conditions in other urban centers. These individuals, moving between their workplace and home, may bring back not only money but also HIV-1; given their higher levels of sexual risk, including a greater likelihood of participating in unprotected sex [51], they could act as a bridge, facilitating viral transmission from other provinces or cities to their home regions.
CRF79_0107 was identified among MSM in Jincheng and Datong cities, Shanxi Province, in 2015 [52]. Subsequently, this subtype was also detected in Taiyuan City in 2016, becoming the second largest cluster in the molecular transmission network and thereby playing a crucial role in the transmission process [42]. This subtype has also been detected in other regions of China besides Shanxi Province. For instance, it was identified in Hangzhou in 2019, where nine cases were found among 857 amplified samples, involving MSM across different age groups [29]. Similarly, in April 2019, five cases were identified from 1297 amplified samples in Sichuan Province, all linked to MSM [53]. Interestingly, this research reveals that CRF79_0107 can infect women, which is a significant finding. These findings suggest that the spread of CRF79_0107 has extended beyond MSM to other sexually active populations, indicating a clear provincial-level transmission pattern. The primary focus should be on comprehensively understanding the epidemiological characteristics of this subtype's clustering, pinpointing key transmission groups, and implementing targeted prevention and control measures.
Notably, molecular network analysis revealed that Cluster7 comprises two CRF08_BC sequences and four URFs. Both neighbor-joining phylogenetic tree analysis and molecular network analysis indicate that these four URFs are similar to CRF08_BC. Based on these analyses, we tend to classify these four URFs as CRF08_BC. However, the average differences in nucleotides and amino acids suggest that the URFs are more closely related to the CRF113_0107 subtype. Therefore, further gene sequencing, as well as functional studies, are needed to verify the association. Furthermore, a larger cluster was formed by CRF107_01B, which was initially detected in the MSM population in Heilongjiang [54]. We hypothesize that the spread of this subtype to Taiyuan was facilitated by population mobility.
Our study offered insights into the local transmission characteristics and epidemic patterns of HIV-1 subtypes within various high-risk behavior groups in Taiyuan. Nevertheless, it has several limitations. Firstly, the molecular network deduced from HIV-1 pol sequences represents only a fraction of the comprehensive local risk behavior network, excluding unsequenced and undiagnosed individuals. Secondly, epidemiological data, particularly with regard to sexual contact methods, rely solely on individual reports, potentially introducing information bias.

Conclusions
Our study is the first to apply a detailed molecular epidemiological approach to better explore the local transmission characteristics and epidemic pattern of HIV-1 subtypes among various sexual risk groups in Taiyuan City. Our findings emphasize that it is necessary to conduct in-depth research and precise interventions targeting key clusters and individuals, to explore new models based on the HIV-1 molecular transmission network, and to implement measures such as HIV/AIDS detection and exposure prevention in order to effectively block the ongoing transmission of HIV/AIDS and reduce the incidence of new infections. Understanding these epidemic dynamics in real time is of increasing importance for public health management in terms of guiding prevention efforts.

Figure 2. The neighbor-joining phylogenetic tree of HIV-1 pol sequences obtained from Taiyuan. Different colors represent different subtypes; red branches represent immigrant sequences, and black branches show local sequences; the reference sequences are indicated in red font.

Figure 3. Trends in the subtypes of HIV-1 from 2021 to 2023. (A) Trends of total; (B) trends of locals; (C) trends of migrants.

Figure 4. The molecular transmission network diagram: (a) HIV-1 molecular clusters coded by subtypes; (b) HIV-1 molecular clusters coded by household registration level. Individuals infected with HIV-1 through homosexual contact are labeled with circles (○), individuals infected with HIV-1 through heterosexual contact are labeled with rectangles (□), and individuals infected with HIV-1 through injection drug use are labeled with rounded rectangles ( ); this study includes two such cases, both belonging to Cluster1. Individuals infected with HIV-1 through mother-to-child transmission are labeled with upward-pointing triangles (△). Migrant subjects are shown in blue, and locals are indicated in green. Individuals infected with different HIV-1 subtypes are displayed in different colors.

Figure 5. Linkage analysis of different risk behavior groups within the main HIV-1 subtypes in the network. The color indicates the different sexual contact risk groups. (A) Sankey diagram of CRF07_BC. (B) Sankey diagram of CRF01_AE. (C) Sankey diagram of other HIV-1 subtypes.

Table 1. Data of newly diagnosed HIV-infected individuals in Taiyuan.

Table 2. General distribution of epidemic subtypes in Taiyuan. a Numbers in square brackets show the proportion of the cases as a percentage of the total 584 subjects. b Numbers in parentheses show the proportion of HIV-1 subtypes as a percentage of each variable.

Table 3. Univariate and multivariate analysis of clustered and non-clustered subjects in the molecular network.

Table 4. Characteristics of the large molecular transmission clusters.
